Abstract

Given a formula $\Phi$ in quantifier-free Presburger arithmetic, it is well known that, if there is a
satisfying solution to $\Phi$, there is one whose size, measured in bits, is polynomially bounded in
the size of $\Phi$. In this paper, we consider a special class of quantifier-free Presburger formulas in
which  most  linear  constraints  are  separation  (difference-bound)  constraints,  and  the  non-separation 
constraints  are  sparse.  This  class  has  been  observed  to  commonly  occur  in  software  verification 
problems.  We  derive  a  new  solution  bound  in  terms  of  parameters  characterizing  the  sparseness  of 
linear  constraints  and  the  number  of  non-separation  constraints,  in  addition  to  traditional  measures 
of  formula  size.  In  particular,  the  number  of  bits  needed  per  integer  variable  is  linear  in  the  number 
of  non-separation  constraints  and  logarithmic  in  the  number  and  size  of  non-zero  coefficients  in 
them,  but  is  otherwise  independent  of  the  total  number  of  linear  constraints  in  the  formula.  The 
derived  bound  can  be  used  in  a  decision  procedure  based  on  instantiating  integer  variables  over 
a  finite  domain  and  translating  the  input  quantifier-free  Presburger  formula  to  an  equi-satisfiable 
Boolean  formula,  which  is  then  checked  using  a  Boolean  satisfiability  solver.  We  present  empirical 
evidence  indicating  that  this  method  can  greatly  outperform  other  decision  procedures. 
1 Introduction


Presburger arithmetic [27] is defined as the first-order theory of the structure $\langle \mathbb{N}, 0, 1, \le, + \rangle$, where
$\mathbb{N}$ denotes the set of natural numbers. The satisfiability problem for Presburger arithmetic is
decidable, but of super-exponential worst-case complexity [13]. Fortunately, for many applications,
such as in program analysis (e.g., [28]) and hardware verification (e.g., [8]), the quantifier-free
fragment suffices.

A formula $\Phi$ in quantifier-free Presburger arithmetic (QFP) is constructed by combining linear
constraints with Boolean operators ($\land$, $\lor$, $\lnot$). Formally, the $i$-th constraint is of the form
$\sum_{j=1}^{n} a_{i,j} x_j \ge b_i$, where the coefficients and the constant terms are integer constants and the
variables $x_1, x_2, \ldots, x_n$ are integer-valued.¹ In this paper, we are concerned with the satisfiability
problem for QFP, viz., that of finding a valuation of the variables such that $\Phi$ evaluates to true.
That this problem is in NP, and hence NP-complete, can be concluded from the result that integer
linear programming is in NP [6, 30, 17, 23].²

Thus,  if  there  is  a  satisfying  solution  to  a  QFP  formula,  there  is  one  whose  size,  measured  in  bits, 
is  polynomially  bounded  in  the  problem  size.  Problem  size  is  traditionally  measured  in  terms  of 
the parameters $m$, $n$, $\log \mu_A$, and $\log \mu_b$, where $m$ is the total number of constraints in the formula,
$n$ is the number of variables, and $\mu_A = \max_{(i,j)} |a_{i,j}|$ and $\mu_b = \max_i |b_i|$ are upper bounds on the
absolute values of coefficients and constant terms respectively.

The above result suggests the following approach to checking the satisfiability of a QFP formula $\Phi$:

1. Compute the polynomial bound $S$ on solution size.

2. Search for a satisfying solution to $\Phi$ in the bounded space $\{0, 1, \ldots, 2^S - 1\}^n$.

This  approach  has  been  successfully  applied  to  highly  restricted  sub-classes  of  QFP,  such  as  equality 
logic [25] and separation logic [9], and is termed finite instantiation. The basic idea is to translate
$\Phi$ to a Boolean formula by encoding each integer variable as a vector of Boolean variables (a
“symbolic bit-vector”) of length $S$. The resulting Boolean formula is checked using a Boolean
satisfiability  (SAT)  solver.  This  approach  leverages  the  dramatic  advances  in  SAT  solving  made  in 
recent  years  (e.g.,  [20,  15]).  It  is  straightforward  to  extend  the  approach  to  additionally  handle  the 
theory  of  uninterpreted  functions  and  equality,  by  using,  e.g.,  Ackermann’s  technique  of  eliminating 
function  applications  [1]. 
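
The two-step recipe above can be illustrated without a SAT solver. The following Python sketch is only an illustration under stated assumptions: it enumerates the bounded space directly instead of building a Boolean encoding, the formula is represented as an ordinary Python predicate, and the name `bounded_search` and the example formula are hypothetical.

```python
from itertools import product

def bounded_search(phi, n, S):
    """Return a tuple in {0, ..., 2**S - 1}**n that satisfies phi, or None.

    phi is a Python predicate over n integers standing in for a QFP formula.
    """
    for candidate in product(range(2 ** S), repeat=n):
        if phi(*candidate):
            return candidate
    return None

# Hypothetical formula: (x - y >= 2) and (2x + 3y >= 7 or not (y >= 1))
phi = lambda x, y: (x - y >= 2) and ((2 * x + 3 * y >= 7) or not (y >= 1))
solution = bounded_search(phi, n=2, S=3)
```

Explicit enumeration takes time exponential in $nS$, which is why the approach described in the text hands the bounded search to a SAT solver instead.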

However,  a  naive  implementation  of  a  decision  procedure  based  on  finite  instantiation  fails  for 
QFP formulas encountered in practice. The problem is that the bound on solution size, $S$, is
$O(\log m + \log \mu_b + m[\log m + \log \mu_A])$. In particular, the presence of the $m \log m$ term means that
for  practical  problems  involving  hundreds  of  linear  constraints,  the  Boolean  formulas  generated  are 
likely  to  be  too  large  to  be  decided  by  present-day  SAT  solvers. 

In  this  paper,  we  explore  the  above  finite  instantiation-based  approach  to  deciding  QFP  formulas, 
but  with  a  focus  on  formulas  generated  in  software  verification.  It  has  been  observed,  by  us  and 
others,  that  QFP  formulas  from  this  domain  have: 

¹ While Presburger arithmetic is defined over $\mathbb{N}$, we interpret the variables over $\mathbb{Z}$, as this is more general and more
suitable for applications. It is straightforward to translate a formula with integer variables to one where variables are
interpreted over $\mathbb{N}$, and vice-versa, by adding (linearly many) additional variables or constraints.

² The NP-hardness follows from a straightforward encoding of the 3SAT problem as a 0-1 integer linear program.
Project   Maximum Fraction of          Maximum Width of a
          Non-Separation Constraints   Non-Separation Constraint
-------   --------------------------   -------------------------
Blast     0.0276                       6
Magic     0.0032                       2
MIT       0.0087                       3
WiSA      0.0054                       4

Table 1: Linear Arithmetic Constraints in Software Verification are Mostly Separation
Constraints. For each software verification project, the maximum fraction of non-separation
constraints is shown, as well as the maximum width of a non-separation constraint, where the
maximum is taken over all formulas in the set. The Blast formulas were generated from device
drivers written in C, the Magic formulas from an implementation of openssl written in C, the
MIT formulas from Java programs, and the WiSA formulas were generated in the checking of
format string vulnerabilities.

1. Mainly Separation Constraints: Of the $m$ constraints, $m - k$ are separation constraints, where
$k \ll m$. Separation constraints, also called difference-bound constraints, are of the form
$x_i - x_j \bowtie b_t$ or $x_i \bowtie b_t$, where $b_t$ is an integer constant, and $\bowtie \in \{>, \ge, =, <, \le\}$.

2. Sparse Structure: The $k$ non-separation constraints are sparse, with at most $w$ variables per
constraint, where $w$ is “small”. We will refer to $w$ as the width of the constraint.
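
The parameters $m$, $k$, and $w$ can be read off a formula syntactically. The sketch below is an illustration only: the representation of a constraint as a map from variable names to non-zero coefficients, and the names `is_separation` and `sparsity_parameters`, are assumptions made for this example. The relation symbol and the constant are omitted because they do not affect the classification.

```python
def is_separation(coeffs):
    """coeffs maps variable name -> non-zero integer coefficient.

    Accepts x_i - x_j and x_i; a lone -x_i is also accepted, since it becomes
    the x_i form after negating both sides and flipping the relation.
    """
    values = sorted(coeffs.values())
    return values in ([1], [-1], [-1, 1])

def sparsity_parameters(constraints):
    """Return (m, k, w): total constraints, non-separation count, max width."""
    non_sep = [c for c in constraints if not is_separation(c)]
    m = len(constraints)
    k = len(non_sep)
    w = max((len(c) for c in non_sep), default=0)
    return m, k, w

constraints = [
    {"x": 1, "y": -1},          # x - y >= 3
    {"y": 1},                   # y >= 0
    {"y": 1, "z": -1},          # y - z <= 5
    {"x": 2, "y": 3, "z": -1},  # 2x + 3y - z >= 4 (non-separation, width 3)
]
m, k, w = sparsity_parameters(constraints)
```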

Pratt [26] observed that most inequalities generated in program verification are separation
constraints. More recently, the authors of the theorem prover Simplify observed in the context of the
Extended Static Checker for Java (ESC/Java) project that “the inequalities that occur in program
checking rarely involve more than two or three terms” [12]. We have performed a study of formulas
selected from various recent software verification projects: the Blast project at Berkeley [16], the
Magic project at CMU [10], the Wisconsin Safety Analyzer (WiSA) project³, and the software
upgrade checking project at MIT [19]. The results of this study, indicated in Table 1, support
the aforementioned observations regarding the “sparse, mostly separation” nature of constraints
in  QFP  formulas.  To  our  knowledge,  no  previous  decision  procedure  for  QFP  has  attempted  to 
exploit  this  problem  structure. 

We  make  the  following  novel  contributions  in  this  paper: 

• We derive bounds on solutions for QFP formulas, not only in terms of the traditional parameters
$m$, $n$, $\mu_A$, and $\mu_b$, but also in terms of $k$ and $w$. In particular, we show that the
worst-case number of bits required per integer variable is linear in $k$, but only logarithmic in
$w$. Unlike previously derived bounds, ours is independent of the total number of constraints $m$.

•  We  use  the  derived  bounds  in  a  sound  and  complete  decision  procedure  for  QFP  based  on 
finite  instantiation,  and  present  empirical  evidence  that  our  method  can  greatly  outperform 
other  decision  procedures. 

Related  Work.  There  has  been  much  work  on  deciding  quantifier-free  Presburger  arithmetic;  we 
present  a  brief  discussion  here  and  refer  the  reader  to  a  recent  survey  [14]  for  more  details.  Recent 
techniques  fall  into  three  categories: 

³ http://www.cs.wisc.edu/wisa
• The first class comprises procedures targeted towards solving conjunctions of constraints, with
disjunctions  handled  by  enumerating  terms  in  a  disjunctive  normal  form  (DNF).  Examples 
include  the  Omega  test  [28]  and  solvers  based  on  other  integer  linear  programming  techniques. 
The  drawback  of  these  methods  is  the  need  to  enumerate  the  potentially  exponentially  many 
terms  in  the  DNF  representation. 

• The second set of methods attempts to remedy this problem by instead relying on modern
SAT solving strategies. The approach works as follows. A Boolean abstraction of the QFP
formula $\Phi$ is generated by replacing each linear constraint with a corresponding Boolean
variable. If the abstraction is unsatisfiable, then so is $\Phi$. If not, the satisfying assignment
(model) is checked for consistency with the theory of quantifier-free Presburger arithmetic,
using a ground decision procedure for conjunctions of linear constraints. Assignments that
are inconsistent are excluded from later consideration by adding a “lemma” to the Boolean
abstraction. The process continues until either a consistent assignment is found, or all
(exponentially many) assignments have been explored. Examples of decision procedures in this
class  that  have  some  support  for  QFP  include  CVC  [2,  3]  and  ICS  [11].  These  provers  employ 
the  Nelson-Oppen  architecture  for  cooperating  decision  procedures  [22],  or  some  variant  of 
it.  Note  that  the  original  Nelson-Oppen  framework  was  only  defined  for  disjoint  theories. 
In  order  to  exploit  the  mostly-separation  structure  of  a  formula,  one  approach  could  be  to 
combine  a  decision  procedure  for  a  theory  of  separation  constraints  with  one  for  a  theory  of 
non-separation  constraints,  but  this  needs  an  extension  of  the  Nelson-Oppen  framework  to 
apply  to  these  non-disjoint  theories. 

• The final class of methods is based on finite automata theory (e.g., [31, 14]). The basic
idea is to construct a finite automaton corresponding to the input QFP formula $\Phi$, such that the
language accepted by the automaton consists of the binary encodings of satisfying solutions of
$\Phi$. According to a recent experimental evaluation with other methods [14], these techniques
are better than others at solving formulas with very large coefficients, but do not scale well
with the number of variables and constraints.⁴

The  approach  we  present  in  this  paper  is  distinct  from  the  categories  mentioned  above.  In  particular, 
the  following  unique  features  differentiate  it  from  previous  methods: 

•  It  is  the  first  finite  instantiation  method,  translating  a  QFP  formula  to  SAT  in  a  single  step. 
The  clear  separation  between  the  translation  and  the  SAT  solving  allows  us  to  leverage  future 
advances  in  SAT  solving  far  more  easily  than  other  SAT-based  procedures. 

•  It  is  the  first  technique,  to  the  best  of  our  knowledge,  that  exploits  the  structure  of  formulas 
commonly  encountered  in  software  verification. 

Outline  of  the  paper.  The  rest  of  this  paper  is  organized  as  follows.  In  Section  2,  we  discuss 
background  material  on  bounds  on  satisfying  solutions  of  integer  linear  programs.  An  integer  linear 
program  (ILP)  is  a  conjunction  of  linear  constraints,  and  hence  is  a  special  kind  of  QFP  formula. 
The  bounds  for  QFP  follow  directly  from  those  for  ILPs.  Our  main  theoretical  results  are  presented 
in  Sections  3-5.  Section  3  gives  bounds  for  ILPs  for  the  case  of  k  =  0,  when  all  constraints  are 
separation  constraints.  In  Section  4,  we  compute  a  bound  for  ILPs  for  arbitrary  k.  In  Section  5, 
we  show  how  our  results  extend  to  arbitrary  QFP  formulas.  We  report  on  experimental  results  in 
Section  6,  and  conclude  in  Section  7. 

⁴ Note that automata-based techniques can handle full Presburger arithmetic, not just the quantifier-free fragment.
2 Background


In  this  section,  we  define  the  integer  linear  programming  problem  formally  and  state  the  previous 
results  on  bounding  satisfying  solutions  of  ILPs.  A  more  detailed  discussion  on  the  steps  outlined 
in Section 2.1 can be found in reference books on ILP (e.g., [29, 24]). Useful results on determinants
used  in  the  paper  are  reviewed  in  Appendix  B. 

2.1  Preliminaries 

Consider a system of $m$ linear constraints in $n$ integer-valued variables:

$$Ax \ge b \tag{1}$$

Here $A$ is an $m \times n$ matrix with integral entries, $b$ is an $m \times 1$ vector of integral entries, and $x$ is an
$n \times 1$ vector of integer-valued variables. A satisfying solution to system (1) is an evaluation of $x$
that satisfies (1).

In system (1), the entries in $x$ can be negative. We can constrain the variables to be non-negative
by adding a dummy variable $x_0$ that refers to the “zero value,” replacing each original variable $x_i$
by $x'_i - x_0$, and then adjusting the coefficients in the matrix $A$ to get a new constraint matrix $A'$
and the following system:⁵

$$A'x' \ge b, \qquad x' \ge 0 \tag{2}$$

Here the system has $n' = n + 1$ variables, and $x' = [x'_1, x'_2, \ldots, x'_n, x_0]^T$. $A'$ has the structure that
$a'_{i,j} = a_{i,j}$ for $j = 1, 2, \ldots, n$ and $a'_{i,n+1} = -\sum_{j=1}^{n} a_{i,j}$. Note that the last column of $A'$ is a linear
combination of the previous $n$ columns. As shown in Proposition 1 in Appendix A, system (1) has
a solution if and only if system (2) has one.
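
The construction of $A'$ and of a non-negative solution can be checked on a small instance. The following sketch is illustrative only: the helper names `add_zero_column` and `evaluate` and the example matrix are hypothetical, and the shift $x_0 = \max(0, -\min_i x_i)$ is just one convenient choice that makes every $x'_i = x_i + x_0$ non-negative.

```python
def add_zero_column(A):
    """Append the column a'_{i,n+1} = -sum_j a_{i,j} to each row of A."""
    return [row + [-sum(row)] for row in A]

def evaluate(A, x):
    """Return the vector A x for a list-of-rows matrix A."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[1, -1, 0],
     [2, 0, 3]]
x = [-4, -6, 5]                        # a valuation with negative entries
x0 = max(0, -min(x))                   # value chosen for the dummy variable
x_prime = [v + x0 for v in x] + [x0]   # x'_i = x_i + x0, followed by x0
A_prime = add_zero_column(A)
```

Here `evaluate(A_prime, x_prime)` equals `evaluate(A, x)`, so any right-hand side $b$ satisfied by $x$ in system (1) is also satisfied by the non-negative vector $x'$ in system (2).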

Finally, adding surplus variables to the system, we can rewrite system (2) as follows:

$$A''x'' = b, \qquad x'' \ge 0 \tag{3}$$

where $A'' = [A' \mid -I_m]$ is an $m \times (n' + m)$ integer matrix formed by concatenating $A'$ with the
negation of the $m \times m$ identity matrix $I_m$.

For convenience we will drop the primes, referring to $A''$ and $x''$ simply as $A$ and $x$. Rewriting
system (3) thus, we get

$$Ax = b, \qquad x \ge 0 \tag{4}$$

Hereafter we will use the definition in (4). Let $\mu_A = \max_{(i,j)} |a_{i,j}|$ and $\mu_b = \max_i |b_i|$ be upper
bounds on the absolute values of entries of $A$ and $b$ respectively.

⁵ Note that this procedure can increase the width of a constraint by 1. The statistics in Table 1 show the width
before this procedure is applied, computed from constraints as they appear in the original formulas.
2.2 Previous Results


The  results  of  this  paper  build  on  results  obtained  by  Borosh,  Treybig,  and  Flahive  [6,  5]  on 
bounding  the  solution  of  systems  of  the  form  (4).  We  state  their  result  in  the  following  theorem: 

Theorem 1 Consider the augmented matrix $[A \mid b]$ of dimension $m \times (n' + m + 1)$. Let $\Delta$ be the
maximum of the absolute values of all minors of this augmented matrix. Then, the system (4) has
a satisfying solution if and only if it has one with all entries bounded by $(n + 2)\Delta$.

However, note that the determinant of a matrix can be more than exponential in the dimension of
the matrix [7]. In the case of the Borosh-Flahive-Treybig result, it means that $\Delta$ can be as large
as …, where $\mu = \max\{\mu_A, \mu_b\}$.

Papadimitriou  [23,  24]  also  gives  a  bound  of  similar  size,  stated  in  the  following  theorem: 

Theorem 2 If the ILP of (4) has a satisfying solution, then it has a satisfying solution where all
entries in the solution vector are bounded by $(n' + m)(1 + \mu_b)(m \mu_A)^{2m+3}$.

Papadimitriou’s bound implies that we need $O(\log m + \log \mu_b + m[\log m + \log \mu_A])$ bits to encode each
variable (assuming $n' = O(m)$). The Borosh-Flahive-Treybig bound implies needing $O(m[\log m +
\log \mu])$ bits per variable, which is of the same order.
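
To get a feel for the magnitude of the bound in Theorem 2, it can be evaluated with exact integer arithmetic. This is a minimal sketch; the function name `papadimitriou_bits` and the sample parameter values are assumptions chosen for illustration and do not come from the benchmarks discussed in this paper.

```python
def papadimitriou_bits(m, n_prime, mu_A, mu_b):
    """Bits needed to write the Theorem 2 bound (n' + m)(1 + mu_b)(m mu_A)^(2m+3)."""
    bound = (n_prime + m) * (1 + mu_b) * (m * mu_A) ** (2 * m + 3)
    # int.bit_length() is the number of binary digits of the bound itself.
    return bound.bit_length()

# Hypothetical instance: 100 constraints, 50 variables, small coefficients.
bits = papadimitriou_bits(m=100, n_prime=50, mu_A=2, mu_b=10)
```

For this hypothetical instance the result exceeds 1500 bits per variable, which illustrates why the unparameterized bound leads to very large Boolean encodings.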


3  Bounds  for  a  System  of  Separation  Constraints 

Let us first consider computing solution bounds for an ILP for the case where $k = 0$, i.e., system (4)
comprises only separation constraints.

In this case, the left-hand side of each equation comprises exactly three variables: two variables $x_i$
and $x_j$ where $0 \le i, j \le n$ and one surplus variable $x_l$ where $n + 1 \le l \le n + m$. The $t$-th equation
in the system is of the form $x_i - x_j - x_l = b_t$.

As we noted in Section 2.1, the matrix $A$ can be written as $[A_o \mid -I_m]$ where $A_o$ comprises the first
$n' = n + 1$ columns, and $I_m$ is the $m \times m$ identity matrix.

The important property of $A_o$ is that each row has exactly one $+1$ entry and exactly one $-1$
entry, with all other entries 0. Thus, $A_o^T$ can be interpreted as the node-arc incidence matrix of a
directed graph. Therefore, $A_o^T$ is totally unimodular (TUM), i.e., every square submatrix of $A_o^T$ has
determinant in $\{0, -1, +1\}$ [24]. Therefore, $A_o$ is TUM, and so is $A = [A_o \mid -I_m]$.
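
The TUM property can be confirmed by brute force on a tiny instance. The sketch below enumerates every square submatrix, so it is exponential and only meant as an illustration (it is not the polynomial-time test mentioned in the literature); the function names and the example matrix are hypothetical.

```python
from itertools import combinations

def det(M):
    """Integer determinant by Laplace expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_totally_unimodular(A):
    """Check that every square submatrix of A has determinant in {-1, 0, 1}."""
    rows, cols = len(A), len(A[0])
    for r in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), r):
            for cs in combinations(range(cols), r):
                sub = [[A[i][j] for j in cs] for i in rs]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

# Rows encode x1 - x2, x2 - x3, x3 - x1, x1 - x0 (columns: x1, x2, x3, x0).
A_o = [[1, -1, 0, 0],
       [0, 1, -1, 0],
       [-1, 0, 1, 0],
       [1, 0, 0, -1]]
separation_is_tum = is_totally_unimodular(A_o)
```

A matrix such as `[[1, 1], [-1, 1]]`, whose determinant is 2, fails the same check.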

Now, let us consider using the Borosh-Flahive-Treybig bound stated in Theorem 1. This bound is
stated in terms of the minors of the matrix $[A \mid b]$. For the special case of this section, we have the
following bound on the size of any minor:

Theorem 3 The absolute value of any minor of $[A \mid b]$ is bounded above by $s\mu_b$, where $s = \min(n +
1, m)$.

Proof: 

Consider any minor $M$ of $[A \mid b]$. Let $r$ be the order of $M$.

If the minor is obtained by deleting the last column (corresponding to $b$), then it is a minor of
$A$, and its value is in $\{0, -1, +1\}$ since $A$ is TUM. Thus, the bound of $s\mu_b$ holds for any
non-trivial minor with $s \ge 1$ and $\mu_b \ge 1$.

Suppose  the  b  column  is  not  deleted. 

First, note that the matrix $A$ is of the form $[A_o \mid -I_m]$ where the rank of $A_o$ is at most $s' = \min(n, m)$.
This is because $A_o$ has dimensions $m \times (n + 1)$, and the last column of $A_o$, corresponding to the
variable $x_0$, is a linear combination of the previous $n$ columns.⁶

Next, suppose the sub-matrix corresponding to $M$ comprises $p$ columns from the $-I_m$ part, $r - p - 1$
columns from the $A_o$ part, and the one column corresponding to $b$. Since permuting the rows and
columns of $M$ does not change its absolute value, we can permute the rows of $M$ and the columns
corresponding to the $-I_m$ part to get the corresponding sub-matrix in the following form:


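$$
\left[\begin{array}{c|cccc|c}
 & 0 & \cdots & 0 & -1 & b_{t_1} \\
 & 0 & \cdots & -1 & 0 & b_{t_2} \\
A_o & \vdots & & & \vdots & \vdots \\
\text{part} & -1 & \cdots & 0 & 0 & b_{t_p} \\
 & 0 & \cdots & 0 & 0 & b_{t_{p+1}} \\
 & \vdots & & & \vdots & \vdots \\
 & 0 & \cdots & 0 & 0 & b_{t_r}
\end{array}\right]
$$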

Expanding $M$ along the last column, we get

$$|M| = |b_{t_1} M_1 - b_{t_2} M_2 + b_{t_3} M_3 - \cdots + (-1)^{r-1} b_{t_r} M_r|$$

where each $M_i$ is a minor corresponding to a submatrix of $A$.

However, notice that $M_i = 0$ for all $1 \le i \le p$, since each of those minors has an entire column
(from the $-I_m$ part) equal to 0. Therefore, we can reduce the right-hand side to the sum of $r - p$
terms:

$$|M| \le |b_{t_{p+1}} M_{p+1}| + |b_{t_{p+2}} M_{p+2}| + \cdots + |b_{t_r} M_r|$$

Notice that, so far, we have not made use of the special structure of $A$.

Now, observing that $A$ is TUM, $|M_i| \le 1$ for all $i$, and hence

$$|M| \le |b_{t_{p+1}}| + |b_{t_{p+2}}| + \cdots + |b_{t_r}|$$

For all $i$, $|b_{t_i}| \le \mu_b$. Further, since each non-zero $M_i$ can be of order at most $s'$, $r - p \le s =
\min(s' + 1, m)$.⁷ Therefore, we get

$$|M| \le s\mu_b$$

□

Using the terminology of Theorem 1, we have $\Delta \le s\mu_b$. Thus, the bound in this case is $(n + 2)s\mu_b$.
Hence $S$, the bound on the number of bits per variable, is

$$\lceil \log(n + 2) + \log s + \log \mu_b \rceil$$

⁶ Refer to the construction of system (2) from system (1).

⁷ We use $s' + 1$ and not $s'$ to account for the case where $p = 0$. The minimum with $m$ is taken because $s' + 1$ can
exceed $m$ but $b$ has only $m$ elements.
Formulas generated from verification problems tend to be overconstrained, so we assume $n < m$.
Thus, $s = n + 1$, and the bound reduces to $O(\log n + \log \mu_b)$ bits per variable.

Remark.  The  only  property  of  the  A  matrix  that  the  proof  of  Theorem  3  relies  on  is  the  totally 
unimodular  (TUM)  property.  Thus,  Theorem  3  would  also  apply  to  any  system  of  linear  constraints 
whose  coefficient  matrix  is  TUM.  Examples  of  such  matrices  include  interval  matrices,  or  more 
generally  network  matrices.  Note  that  the  TUM  property  can  be  tested  for  in  polynomial  time  [29]. 


4  Bounds  for  a  Sparse  System  of  Mainly  Separation  Constraints 

We  now  consider  the  general  case  for  ILPs,  where  we  have  k  non-separation  constraints,  each 
referring  to  at  most  w  variables. 

Without loss of generality, we can reorder the rows of matrix $A$ so that the $k$ non-separation
constraints are the top $k$ rows, and the separation constraints are the bottom $m - k$ rows. Reordering
the rows of $A$ can only change the sign of any minor of $[A \mid b]$, not the absolute value. Thus, the
matrix $[A \mid b]$ can be put into the following form:


$$
\left[\begin{array}{c|c|c}
A_1 & & b_1 \\
 & -I_m & b_2 \\
A_2 & & \vdots \\
 & & b_m
\end{array}\right]
$$

Here, $A_1$ is a $k \times (n + 1)$ dimensional matrix corresponding to the non-separation constraints, $A_2$
is an $(m - k) \times (n + 1)$ dimensional matrix with the separation constraints, $I_m$ is the $m \times m$ identity
matrix corresponding to the surplus variables, and the last column is the vector $b$.

The matrix comprised of $A_1$ and $A_2$ will be referred to, as before, as $A_o$. Note that each row of
$A_1$ has at most $w$ non-zero entries, and each row of $A_2$ has exactly one $+1$ and one $-1$ with the
remaining entries 0. Thus, $A_2$ is TUM.

We  prove  the  following  theorem: 

Theorem 4 The absolute value of any minor of $[A \mid b]$ is bounded above by $s\,\mu_b\,(\mu_A w)^k$, where
$s = \min(n + 1, m)$.

Proof: 

Consider any minor $M$ of $[A \mid b]$, and let $r$ be its order.

As in Theorem 3, if $M$ includes $p$ columns from the $-I_m$ part of $A$, then we can infer that $r - p \le s$.
(Our proof of this property in Theorem 3 made no assumptions on the form of $A_o$.)

If $M$ includes the last column $b$, then as in the proof of Theorem 3, we can conclude that

$$|M| \le (r - p)\,\mu_b \max_j |M_j| \tag{5}$$

where $M_j$ is a minor of $A_o$.

If $M$ does not include $b$, then it is a minor of $A$. Without loss of generality, we can assume that $M$
does not include a column from the $-I_m$ part of $A$, since such columns only contribute to the sign
of the determinant.
So, let us consider bounding a minor $M_j$ of $A_o$ of order $r$ (or $r - 1$, if $M$ includes the $b$ column).

Since $A_o = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$, consider expanding $M_j$, using the standard determinant expansion by minors
along the top $k$ rows corresponding to non-separation constraints (see Equation 8 in Appendix B).
Each term in the expansion is (up to a sign) the product of at most $k$ entries from the $A_1$ portion,
one from each row, and a minor from $A_2$. Since $A_2$ is TUM, each product term is bounded in
absolute value by $\mu_A^k$. Furthermore, there can be at most $w^k$ non-zero terms in the expansion, since
each non-zero product term is obtained by choosing one non-zero element from each of the rows of
the $A_1$ portion of $M_j$, and this can be done in at most $w^k$ ways.

Therefore, $|M_j|$ is bounded by $(\mu_A w)^k$. Combining this with the inequality (5), and since $r - p \le s$,
we get

$$|M| \le s\,\mu_b\,(\mu_A w)^k$$

which is what we set out to prove. □

Thus, we conclude that $\Delta \le s\,\mu_b\,(\mu_A w)^k$, where $s = \min(n + 1, m)$. From Theorems 1 and 4, the
solution bound is $(n + 2)\Delta$. Thus, $S$ is

$$\lceil \log(n + 2) + \log s + \log \mu_b + k(\log \mu_A + \log w) \rceil$$
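
The bit-width above is straightforward to compute exactly. The following sketch uses integer arithmetic to avoid floating-point rounding in the logarithm; the function names and the sample parameter values are hypothetical, and logarithms are taken base 2.

```python
def ceil_log2(x):
    """Ceiling of log2(x) for a positive integer x, using exact arithmetic."""
    return (x - 1).bit_length()

def parameterized_bits(n, m, k, w, mu_A, mu_b):
    """Bits per variable from the bound (n + 2) * s * mu_b * (mu_A * w)**k."""
    s = min(n + 1, m)
    delta = s * mu_b * (mu_A * w) ** k
    return ceil_log2((n + 2) * delta)

# Hypothetical instance: 20 variables, 2 non-separation constraints of width 3.
bits_small_m = parameterized_bits(n=20, m=200, k=2, w=3, mu_A=4, mu_b=100)
bits_large_m = parameterized_bits(n=20, m=2000, k=2, w=3, mu_A=4, mu_b=100)
bits_k_zero = parameterized_bits(n=20, m=200, k=0, w=0, mu_A=1, mu_b=100)
```

In this sketch, increasing $m$ from 200 to 2000 leaves the result unchanged because $s = n + 1$ in both cases, and setting $k = 0$ recovers the bound for pure separation constraints from Section 3.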


We make the following observations about the bound derived above, assuming as before, that
$n < m$, and so $s = n + 1$:

• Dependence on Parameters: We observe that the bound is linear in $k$, and logarithmic in $\mu_A$, $w$,
$n$, and $\mu_b$. In particular, the bound is independent of the total number of linear constraints, $m$.

• Worst-case Asymptotic Growth: In the worst case, $k = m$, $w = n + 1$, and $n = O(m)$, and we
get the $O(\log m + \log \mu_b + m[\log m + \log \mu_A])$ bound of Papadimitriou.

• Typical-case Asymptotic Growth: As observed in Section 1, $w$ is typically a small constant,
so the number of bits needed per variable is $O(\log n + \log \mu_b + k \log \mu_A + k)$. In many cases,
$\mu_A$ is also a small constant, simplifying the bound to $O(\log n + \log \mu_b + k)$ bits per variable.

• Representing Non-separation Constraints: There are many ways to represent non-separation
constraints and these have an impact on the bound we derive. In particular, it is possible
to transform a system of non-separation constraints to one with at most three variables per
constraint. For example, the linear constraint $x_1 + x_2 + x_3 + x_4 = x_5$ can be rewritten as:

$$
\begin{aligned}
x_1 + x'_1 &= x_5 \\
x_2 + x'_2 &= x'_1 \\
x_3 + x_4 &= x'_2
\end{aligned}
$$

For the original representation, $k = 1$ and $w = 5$, while for the new representation $k = 3$ and
$w = 3$. Since our bound is linear in $k$ and logarithmic in $w$, the original representation would
yield a tighter bound.

Similarly, one can eliminate variables with coefficients greater than 1 in absolute value by a
similar process of adding new non-separation constraints. Again, since the bound is logarithmic
in $\mu_A$, it would be preferable to avoid adding new non-separation constraints.
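
The rewriting into three-variable constraints discussed in the last observation can be sketched as a simple chain construction. The function name `split_sum` and the fresh variable names `t1`, `t2`, ... are hypothetical; they play the role of $x'_1, x'_2$ in the example above.

```python
def split_sum(terms, total):
    """Rewrite terms[0] + ... + terms[-1] = total into 3-variable equalities.

    Returns triples (left, right, result), each meaning left + right = result.
    Assumes at least two terms; fresh variables are named t1, t2, ...
    """
    triples = []
    result = total
    for i, term in enumerate(terms[:-2], start=1):
        fresh = f"t{i}"
        triples.append((term, fresh, result))  # term + fresh = result
        result = fresh                         # the rest must sum to fresh
    triples.append((terms[-2], terms[-1], result))
    return triples

chain = split_sum(["x1", "x2", "x3", "x4"], "x5")
k_new = len(chain)                    # number of non-separation constraints
w_new = max(len(t) for t in chain)    # width of each new constraint
```

For the five-variable example this yields three constraints of width three, so $k$ grows from 1 to 3 while $w$ shrinks from 5 to 3, which is the unfavourable trade described above.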


The derived bound only yields benefits in the case when the system has few non-separation
constraints which themselves are sparse. In this case, we can instantiate variables over a finite domain
that  is  much  smaller  than  that  obtained  without  making  any  assumptions  on  the  structure  of  the 
system. 


5  Bounds  for  Arbitrary  Quantifier-Free  Presburger  Formulas 

We now return to the original goal of this paper, that of finding a solution bound for an arbitrary
QFP formula $\Phi$. Suppose that $\Phi$ has $m$ linear constraints $\phi_1, \phi_2, \ldots, \phi_m$, of which $m - k$ are
separation constraints, and $n$ variables $x_1, x_2, \ldots, x_n$. As before, we assume that each non-separation
constraint has at most $w$ variables, $\mu_A$ is the maximum over the absolute values of coefficients
$a_{i,j}$ of variables, and $\mu_b$ is the maximum over the absolute values of constants $b_i$ appearing in the
constraints.

We  prove  the  following  theorem. 

Theorem 5 If $\Phi$ is satisfiable, there is a solution to $\Phi$ that is bounded by $(n + 2)\Delta$ where

$$\Delta = s\,(\mu_b + 1)\,(\mu_A w)^k$$

and $s = \min(n + 1, m)$.

Proof: Let $\sigma$ be a (concrete) model of $\Phi$. Let $m'$ constraints, $\phi_{i_1}, \phi_{i_2}, \ldots, \phi_{i_{m'}}$, evaluate to true
under $\sigma$, the rest evaluating to false. Let $A' = [a_{i,j}]$ be an $m' \times n$ matrix in which each row comprises
the coefficients of variables $x_1, x_2, \ldots, x_n$ in a constraint $\phi_{i_k}$, $1 \le k \le m'$. Thus, $A' = [a_{i,j}]$ where
$i \in \{i_1, \ldots, i_{m'}\}$.

Now consider a constraint $\phi_{i_k}$ where $k > m'$, that evaluates to false under $\sigma$. $\phi_{i_k}$ is the inequality

$$\sum_{j=1}^{n} a_{i_k,j}\, x_j \ge b_{i_k}$$

Then $\sigma$ satisfies $\lnot\phi_{i_k}$, which is the inequality

$$\sum_{j=1}^{n} a_{i_k,j}\, x_j < b_{i_k}$$

or equivalently,

$$\sum_{j=1}^{n} -a_{i_k,j}\, x_j \ge -b_{i_k} + 1$$

Let $A''$ be an $(m - m') \times n$ matrix corresponding to the coefficients of variables in constraints
$\lnot\phi_{i_{m'+1}}, \lnot\phi_{i_{m'+2}}, \ldots, \lnot\phi_{i_m}$. Thus, $A'' = [-a_{i,j}]$ where $i \in \{i_{m'+1}, \ldots, i_m\}$.

Finally, let $b = [b_{i_1}, b_{i_2}, \ldots, b_{i_{m'}}, -b_{i_{m'+1}} + 1, -b_{i_{m'+2}} + 1, \ldots, -b_{i_m} + 1]^T$.
Clearly, $\sigma$ is a satisfying solution to the ILP given by

$$\begin{bmatrix} A' \\ A'' \end{bmatrix} x \ge b \tag{6}$$
Also, if the system (6) has a satisfying solution then $\Phi$ is satisfied by that solution. Thus, $\Phi$ and the
system (6) are equi-satisfiable, for every possible system (6) we construct in the manner described
above.

By Theorems 1 and 4, we can conclude that if system (6) has a satisfying solution, it has one
bounded by $(n + 2)\Delta$ where

$$\Delta = s\,(\mu_b + 1)\,(\mu_A w)^k$$

and $s = \min(n + 1, m)$. The factor $\mu_b + 1$ replaces $\mu_b$ because the constants $-b_{i_k} + 1$ of the negated
constraints have absolute value at most $\mu_b + 1$. Moreover, this bound works for every possible system (6).

Therefore, if $\Phi$ has a satisfying solution, it has one bounded by $(n + 2)\Delta$. □
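
The construction of system (6) used in the proof can be sketched as follows. The names are hypothetical, each constraint is assumed to be given as a coefficient list and a constant meaning $\sum_j a_j x_j \ge b$, and, unlike the proof, the sketch keeps the constraints in their original order rather than listing the true ones first.

```python
def ilp_from_model(constraints, sigma):
    """Build the rows (A, b) of system (6) from a model sigma.

    constraints: list of (coeffs, b), each meaning sum_j coeffs[j] * x_j >= b.
    Constraints false under sigma are replaced by their integer negation
    sum_j -coeffs[j] * x_j >= -b + 1.
    """
    A, rhs = [], []
    for coeffs, b in constraints:
        value = sum(a * v for a, v in zip(coeffs, sigma))
        if value >= b:
            A.append(list(coeffs))
            rhs.append(b)
        else:
            A.append([-a for a in coeffs])
            rhs.append(-b + 1)
    return A, rhs

# Constraints x - y >= 2 and 2x + 3y >= 7, with sigma = (x, y) = (2, 0).
constraints = [([1, -1], 2), ([2, 3], 7)]
sigma = [2, 0]
A6, b6 = ilp_from_model(constraints, sigma)
```

By construction `sigma` satisfies every row of the resulting system, and any other solution of that system assigns the same truth value to each linear constraint as `sigma` does.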

Thus, to generate the Boolean encoding of the starting QFP formula, we must encode each integer
variable as a symbolic bit-vector of length $S = \lceil \log[(n + 2)\Delta] \rceil = \lceil \log(n + 2) + \log s + \log(\mu_b + 1) +
k(\log \mu_A + \log w) \rceil$.

Remark.  In  the  preceding  discussion,  we  have  used  a  single  bit-vector  length  for  all  integer 
variables appearing in the formula $\Phi$. This is conservative. In general, we can partition the
set of variables into classes such that two variables are placed in the same class if there is a
constraint in which they both appear with non-zero coefficients. For each class, we separately
compute parameters $n$, $k$, $\mu_b$, $\mu_A$, and $w$, resulting in a separately computed bit-vector length for
each  class.  The  correctness  of  this  partitioning  optimization  follows  from  a  reduction  to  ILP  as 
performed  in  the  proof  of  Theorem  5,  and  the  observation  that  a  satisfying  solution  to  a  system 
of  ILPs,  no  two  of  which  share  a  variable,  can  be  obtained  by  solving  them  independently  and 
concatenating  the  solutions. 
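One way to compute such a partition is with a union-find structure, sketched below. This is an illustrative implementation, not necessarily the one used in UCLID; it assumes each constraint is given as a dictionary from variable names to coefficients, and it takes the transitive closure of the "appear together in a constraint" relation so that the classes are disjoint.

```python
from typing import Dict, List


def variable_classes(constraints: List[Dict[str, int]]) -> List[List[str]]:
    """Group variables that are linked, directly or transitively, by a shared constraint."""
    parent: Dict[str, str] = {}

    def find(v: str) -> str:
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a: str, b: str) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for constraint in constraints:
        # Only variables with non-zero coefficients count as appearing.
        present = [v for v, coeff in constraint.items() if coeff != 0]
        for v in present:
            find(v)  # register the variable even if it appears alone
        for v in present[1:]:
            union(present[0], v)

    classes: Dict[str, List[str]] = {}
    for v in list(parent):
        classes.setdefault(find(v), []).append(v)
    return sorted(sorted(members) for members in classes.values())


example = [{"x": 1, "y": -1}, {"y": 1, "z": -1}, {"u": 2, "v": 3}, {"w": 1, "x": 0}]
print(variable_classes(example))  # [['u', 'v'], ['w'], ['x', 'y', 'z']]
```

Each resulting class can then be given its own bit-vector length by computing the bound parameters over only the constraints that mention its variables.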


6  Implementation  and  Experimental  Results 

We  used  the  bound  derived  in  the  previous  section  to  implement  a  decision  procedure  based  on 
finite  instantiation.  Integer  variables  in  the  QFP  formula  are  encoded  as  symbolic  bit-vectors  large 
enough  to  express  any  integer  value  within  the  bound.  Arithmetic  operators  are  implemented 
as arbitrary-precision bit-vector arithmetic operations. Equalities and inequalities over integer
expressions are translated to corresponding relations over bit-vector expressions. The resulting
Boolean  formula  is  passed  as  input  to  a  SAT  solver. 
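The flavor of such a translation can be sketched with a ripple-carry adder and a comparator over bit lists. This is a hypothetical illustration only: the text does not specify UCLID's circuits, the encoding below is unsigned and least-significant-bit first, and Python booleans stand in for the Boolean formula nodes that a real encoder would build symbolically.

```python
from typing import List

Bits = List[bool]  # assumed encoding: unsigned, least-significant bit first


def to_bits(value: int, width: int) -> Bits:
    """Encode a non-negative integer as a fixed-width bit list."""
    return [bool((value >> i) & 1) for i in range(width)]


def from_bits(bits: Bits) -> int:
    """Decode a bit list back to an integer."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)


def add(a: Bits, b: Bits) -> Bits:
    """Ripple-carry adder; the result is one bit wider, so no carry is lost."""
    out, carry = [], False
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        carry = (x and y) or (carry and (x ^ y))
    out.append(carry)
    return out


def greater_equal(a: Bits, b: Bits) -> bool:
    """a >= b for equal-width unsigned vectors, scanning from low to high bits."""
    ge = True
    for x, y in zip(a, b):
        # A higher bit that differs overrides the verdict from the lower bits.
        ge = (x and not y) or ((x == y) and ge)
    return ge


total = add(to_bits(13, 5), to_bits(22, 5))
print(from_bits(total), greater_equal(to_bits(9, 4), to_bits(6, 4)))
```

In a symbolic encoder each `and`, `or`, and `^` would create a gate over Boolean variables instead of evaluating immediately, so the size of the resulting formula grows with the bit-vector length.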

We implemented our procedure as part of UCLID⁸, which is written in Moscow ML⁹. In our
implementation we used the zChaff SAT solver¹⁰, version 2003.7.22. We compared UCLID’s performance
with that of the SAT-based prover ICS (the latest version 2.0)¹¹ and the automata-based procedure
LASH¹². While LASH is sound and complete for QFP, ICS 2.0 is incomplete; i.e., it can report
a formula to be satisfiable when it is not. The ground decision procedure ICS uses is the
Simplex linear programming algorithm with some additional heuristics to deal with integer variables.
However, in our experiments, both UCLID and ICS returned the same answer whenever they both
terminated within the timeout.¹³

⁸ http://www.cs.cmu.edu/~uclid
⁹ http://www.dina.dk/~sestoft/mosml.html
¹⁰ http://ee.princeton.edu/~chaff/zchaff.php
¹¹ http://www.icansolve.com
¹² http://www.montefiore.ulg.ac.be/~boigelot/research/lash
¹³ We also attempted comparisons with CVC-Lite (the new version of CVC which includes a ground decision
procedure for QFP [3]). However, the implementation was too unstable to be able to make useful comparisons. We
intend to perform a comparative evaluation when a stable implementation becomes available.
For benchmarks, we used several formulas from the Wisconsin Safety Analyzer project on checking
format string vulnerabilities. The benchmarks include both satisfiable and unsatisfiable formulas
in an extension of QFP with uninterpreted functions. Uninterpreted functions were first eliminated
using Ackermann’s technique [1], and the decision procedures were run on the resulting QFP
formula. Some characteristics of the formulas are displayed in Table 2. For each formula, we
indicate whether it is satisfiable or not, and also give the values of parameters $n$, $m$, $k$, $w$, $\mu_A$ and
$\mu_b$ corresponding to the variable class for which $S = \lceil \log[(n+2)\Delta] \rceil$ is largest, i.e., for which we need
the largest number of bits per variable. Note that the total numbers of variables and constraints,
for all variable classes, are larger: for example, for the benchmark xs-30-40, the formula has 115
variables and 2610 constraints in all. The formulas involve the combination of linear constraints by
arbitrary Boolean operators ($\land$, $\lor$, $\neg$). The key characteristic of formulas generated in this class
of problems is that they vary in $n$, $m$, and $\mu_b$, but the values of $k$, $w$, and $\mu_A$ are fixed at a small
value.

Experiments  were  performed  on  a  Pentium-IV  2  GHz  machine  with  1  GB  of  RAM  running  Linux. 
A  timeout  of  900  seconds  was  imposed  on  each  run. 


| Formula | Ans. | $n$ | $m$ | $k$ | $w$ | $\mu_A$ | $\mu_b$ | $S$ | UCLID Enc. (sec.) | UCLID SAT (sec.) | UCLID Total (sec.) | ICS #(Inc. assn.) | ICS Gnd. (sec.) | ICS Total (sec.) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| s-20-20 | SAT | 28 | 437 | 6 | 5 | 4 | 21 | 41 | 8.98 | 5.86 | 14.84 | 904 | 23.32 | 23.76 |
| s-20-30 | SAT | 28 | 437 | 6 | 5 | 4 | 30 | 41 | 9.02 | 26.01 | 35.03 | 1887 | 51.68 | 52.29 |
| s-20-40 | UNS | 28 | 437 | 6 | 5 | 4 | 40 | 41 | 9.18 | 363.70 | 372.88 | 25776 | 618.59 | 633.58 |
| s-30-30 | SAT | 38 | 792 | 6 | 5 | 4 | 31 | 42 | 11.93 | 12.29 | 24.22 | 2286 | 268.21 | 269.42 |
| s-30-40 | SAT | 38 | 792 | 6 | 5 | 4 | 40 | 42 | 12.00 | 54.50 | 66.50 | 7311 | 860.71 | * |
| xs-20-20 | SAT | 49 | 668 | 6 | 5 | 4 | 21 | 42 | 10.23 | 13.21 | 23.44 | 2307 | 91.31 | 92.87 |
| xs-20-30 | SAT | 49 | 668 | 6 | 5 | 4 | 30 | 43 | 10.48 | 26.64 | 37.12 | 15656 | 765.44 | * |
| xs-20-40 | - | 49 | 668 | 6 | 5 | 4 | 40 | 43 | 10.49 | * | * | 20590 | 867.00 | * |
| xs-30-40 | SAT | 69 | 1288 | 6 | 5 | 4 | 40 | 44 | 17.71 | 33.68 | 51.39 | 9927 | 890.08 | * |

Table 2: Benchmark characteristics and experimental results. Columns $n$ through $S$ are the
maximum parameters. For UCLID, we list the time taken to decide the formula, broken down into
the encoding time (“Enc.”) and the time taken by the SAT solver (“SAT”). For ICS, we give the
total time, the number of inconsistent Boolean assignments analyzed by the ground decision
procedure (“#(Inc. assn.)”), as well as the overall time taken by the ground decision procedure
(“Gnd.”). A “*” indicates that the decision procedure timed out after 900 sec. LASH was unable
to complete within the timeout on any formula.

A  comparison  of  UCLID  versus  ICS  is  displayed  in  Table  2.  LASH  was  unable  to  complete  on  any 
benchmark  within  the  timeout;  we  attribute  this  to  the  relatively  large  number  of  variables  and 
constraints  in  our  formulas,  and  note  that  Ganesh  et  al.  obtained  similar  results  in  their  study  [14]. 
From  Table  2,  we  observe  that  UCLID  outperforms  ICS  on  all  benchmarks,  terminating  within  the 
timeout  on  several  benchmarks  on  which  ICS  does  not. 

The reason for UCLID’s superior performance is the formula structure, where $k$, $w$, and $\mu_A$ remain
fixed at a low value while $m$, $n$, and $\mu_b$ increase. Thus, the maximum number of bits per variable is
only  moderately  large  (about  40),  even  as  m  increases  substantially,  and  the  resulting  SAT  problem 
is  within  the  capacity  of  zChaff.  Also,  we  note  that  UCLID’s  run-time  is  dominated  by  the  SAT 
time,  since  the  time  to  compute  the  parameter  values  and  generate  the  SAT-encoding  is  polynomial 
in  the  input  size. 

For  ICS,  we  note  that  the  run-time  is  dominated  by  the  time  taken  by  the  ground  decision  procedure. 
We observe that the number of inconsistent Boolean assignments alone is not a precise indicator of
total  run-time,  which  also  depends  on  the  time  taken  by  the  ground  decision  procedure  in  ruling 
out  a  single  Boolean  assignment. 


7  Conclusions  and  Future  Work 

In  this  paper,  we  have  presented  a  formal  approach  to  exploiting  the  “sparse,  mainly  separation 
constraint”  nature  of  quantifier-free  Presburger  formulas  encountered  in  software  verification.  Our 
approach is based on deriving a new parameterized bound on satisfying solutions to QFP formulas.
Experimental results show the benefits of using the derived bound in a SAT-based decision
procedure  based  on  finite  instantiation. 

Note  that  the  bounds  we  have  derived  and  used  in  our  experiments  are  conservative.  First,  the  size 
of  minors  in  a  particular  problem  instance  might  be  far  smaller  than  the  bounds  we  have  computed. 
It is unclear how this can be exploited, since the number of minors is exponential in the dimensions
of the input matrix. Second, for certain special cases, one can improve the $(n + 2)\Delta$ bound. For
example, if all the constraints are originally equalities and the system of constraints has full rank,
a bound of $\Delta$ suffices [4]. Third, in cases where the value of $\mu_b$ is very large due to the presence
of a single large constant, one might want to use a less conservative analysis than is performed in
the proof of Theorem 4.

In  our  implementation,  we  translate  a  QFP  formula  to  a  Boolean  formula  in  a  single  step.  An 
alternative approach is to perform this transformation lazily, increasing the bit-vector size “on
demand”.  This  lazy  encoding  approach  works,  in  brief,  as  follows.  (Details  can  be  found  in  [18].) 
We  start  with  an  encoding  size  for  each  integer  variable  that  is  smaller  than  that  prescribed  by 
the  bound.  If  the  resulting  Boolean  formula  is  satisfiable,  so  is  the  original  QFP  formula.  If  not, 
the  proof  of  unsatisfiability  generated  by  the  SAT  solver  is  used  to  generate  a  sound  abstraction  of 
the  original  formula,  which  can  be  checked  with  a  sound  and  complete  decision  procedure  for  QFP 
(such  as  the  one  proposed  in  this  paper).  If  this  decision  procedure  concludes  that  the  abstraction 
is  unsatisfiable,  so  is  the  original  formula,  but  if  not,  it  provides  a  counterexample  which  indicates 
the  necessary  increase  in  the  encoding  size,  and  the  procedure  repeats.  The  advantage  of  this  lazy 
approach  is  twofold:  (1)  It  avoids  using  the  conservative  bounds  we  have  derived  in  this  paper, 
and  (2)  if  the  generated  abstractions  are  small,  the  sound  and  complete  decision  procedure  used  by 
this  approach  will  run  much  faster  than  if  it  were  fed  the  original  formula.  The  bound  S  that  we 
derive  in  this  paper  implies  an  upper  bound  nS  on  the  number  of  iterations  of  this  lazy  encoding 
procedure;  thus  the  lazy  encoding  procedure  needs  only  polynomially  many  iterations  before  it 
terminates  with  the  correct  answer.  Using  the  decision  procedure  proposed  in  this  paper  with  the 
above  lazy  encoding  approach  is  an  interesting  avenue  for  future  work. 

Finally,  it  would  also  be  interesting  to  explore  applications  other  than  software  verification  that 
share  the  “sparse,  mainly  separation  constraints”  property. 
A  Proof of Transformation to System 2

Proposition 1 System (1) has a solution if and only if system (2) has one.

Proof: For the “if” part, suppose we have a solution $x'$ to (2). Construct a candidate solution
vector $x$ by setting $x_j = x'_j - x_0$. Then, consider the $i$th constraint in $A'$, for any $i$. The following
sequence of inequalities holds, each obtained by rewriting the left-hand side of the previous one:

$$\Big(\sum_{j=1}^{n} a_{i,j}\, x'_j\Big) + a_{i,n+1}\, x_0 \ge b_i$$

$$\Big(\sum_{j=1}^{n} a_{i,j}\, x'_j\Big) + x_0 \Big(-\sum_{j=1}^{n} a_{i,j}\Big) \ge b_i$$

$$\sum_{j=1}^{n} a_{i,j}\,(x'_j - x_0) \ge b_i$$

$$\sum_{j=1}^{n} a_{i,j}\, x_j \ge b_i$$

Thus, we can conclude that the $i$th constraint of $A$ is satisfied by $x$ for all $i$. Thus, we have found
a solution to system (1).

Now consider the “only if” part, where we start with a solution $x$ to system (1). Clearly, any
value of $x'$ that sets $x'_j = x_j + x_0$ for all $j$ will satisfy $A'x' \ge b$. But we also need to satisfy
$x' \ge 0$. If none of the $x_j$ are negative, then simply set $x'_j = x_j$ and $x_0 = 0$ and we are done.
Otherwise, set $x_0 = -\min_{k : x_k < 0} x_k$, and set $x'_j = x_j + x_0$. Note that $x_0 > 0$ by construction.
Thus, if for a particular $j$, $x_j \ge 0$, then $x'_j > 0$. Suppose not, i.e., $x_j < 0$. Then, $x_j \ge \min_{k : x_k < 0} x_k$ and so
$x'_j = x_j - \min_{k : x_k < 0} x_k \ge 0$. Thus, we have a solution $x'$ that satisfies (2). □
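The two directions of this proof can be checked numerically with a small sketch. It assumes, as the rewriting steps above suggest, that system (2) extends each row of the coefficient matrix with an extra column equal to the negated row sum and requires all variables, including the new one, to be non-negative; the function names are illustrative.

```python
from typing import List, Sequence, Tuple


def to_nonneg(x: Sequence[int]) -> Tuple[List[int], int]:
    """'Only if' direction: shift x by x0 so that every component is non-negative."""
    x0 = max(0, -min(x))  # 0 when no component is negative, else minus the smallest one
    return [xj + x0 for xj in x], x0


def from_nonneg(x_prime: Sequence[int], x0: int) -> List[int]:
    """'If' direction: recover x_j = x'_j - x0."""
    return [xj - x0 for xj in x_prime]


def satisfies_1(A, b, x) -> bool:
    """System (1): A x >= b, with x unrestricted in sign."""
    return all(sum(a * v for a, v in zip(row, x)) >= bi for row, bi in zip(A, b))


def satisfies_2(A, b, x_prime, x0) -> bool:
    """Assumed system (2): [A | -row sums] (x', x0) >= b, with x' >= 0 and x0 >= 0."""
    nonneg = x0 >= 0 and all(v >= 0 for v in x_prime)
    rows_ok = all(
        sum(a * v for a, v in zip(row, x_prime)) - sum(row) * x0 >= bi
        for row, bi in zip(A, b)
    )
    return nonneg and rows_ok


A = [[1, -1], [2, 1]]
b = [-3, -5]
x = [-2, 1]
x_prime, x0 = to_nonneg(x)
print(x_prime, x0, satisfies_1(A, b, x), satisfies_2(A, b, x_prime, x0))
```

Shifting every variable by the same amount leaves each left-hand side unchanged once the extra column is accounted for, which is the substance of the four-step rewriting in the proof.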

B  Some  Background  on  Determinants 

We  review  some  useful  results  from  the  theory  of  determinants.  All  these  results  can  be  found  in 
standard  textbooks  (e.g.,  [21,  24]). 

Consider a $d \times d$ matrix $P$, where the $(i,j)$th entry is denoted $p_{i,j}$ in the usual way. Then, the full
product expansion of its determinant $|P|$ can be written as

$$|P| = \sum_{\text{permutations } \pi \text{ of } \{1, 2, \ldots, d\}} (-1)^{l(\pi)}\, p_{1,\pi(1)}\, p_{2,\pi(2)} \cdots p_{d,\pi(d)} \qquad (7)$$

where $l(\pi)$ is the number of permutation inversions (swaps) in $\pi$.
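The expansion (7) translates directly into code. The sketch below is for illustration on small matrices only, since it enumerates all $d!$ permutations; it uses 0-based indices, and the function names are illustrative.

```python
from itertools import permutations
from typing import List, Sequence


def inversions(perm: Sequence[int]) -> int:
    """Number of pairs (i, j) with i < j and perm[i] > perm[j]."""
    return sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )


def det_by_permutations(P: List[List[int]]) -> int:
    """Determinant via the full product expansion over all permutations."""
    d = len(P)
    total = 0
    for perm in permutations(range(d)):
        term = -1 if inversions(perm) % 2 else 1
        for i in range(d):
            term *= P[i][perm[i]]
        total += term
    return total


print(det_by_permutations([[1, 2], [3, 4]]))                   # -2
print(det_by_permutations([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))  # 18
```

Each term of the sum contains exactly one entry from every row and every column, which is what makes bounds on the number and size of non-zero entries per row translate into bounds on the determinant.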

The $(i,j)$th minor of a matrix $P$ is the determinant of the submatrix obtained by deleting the $i$th
row and the $j$th column of $P$. A minor of $P$ of order $r$ is the determinant of a square submatrix of
$P$ of order $r$.

If we only fully expand $|P|$ along the first $k$ rows, we get the equation

$$|P| = \sum_{\text{permutations } \pi \text{ of } \{1, 2, \ldots, k\}} (-1)^{l(\pi)}\, p_{1,\pi(1)}\, p_{2,\pi(2)} \cdots p_{k,\pi(k)}\, P_{k,\pi} \qquad (8)$$

where $P_{k,\pi}$ is the minor of $P$ obtained by excluding the first $k$ rows of $P$ and the $k$ columns
corresponding to $\pi(1), \pi(2), \ldots, \pi(k)$.

The determinant of a matrix equals that of its transpose, i.e., $|P| = |P^T|$.

For an arbitrary $d \times d$ matrix $P$ with integral entries, we have the following bound on $|P|$ [7]:

$$|P| \le \mu_P^d\, (d+1)^{(d+1)/2}$$

where $\mu_P = \max_{(i,j)} |p_{i,j}|$. Equality is attained in certain cases.

A square, integer matrix $P$ is called unimodular (UM) if $|P| \in \{0, +1, -1\}$. $P$ is called totally
unimodular (TUM) if every square submatrix of $P$ is UM.

The node-arc incidence matrix of a directed graph is TUM. This matrix has entries in $\{0, +1, -1\}$
and every column has exactly one $+1$ entry and one $-1$ entry.

If $P$ is TUM, then so are $P^T$, $[P \mid I]$, and $[P \mid -I]$.
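These facts can be checked by brute force on small examples. The sketch below tests every square submatrix, so it is exponential and meant only as an illustration; the sign convention for the incidence matrix ($+1$ where an arc leaves a node, $-1$ where it enters) is an assumption, and the function names are illustrative.

```python
from itertools import combinations
from typing import List, Tuple

Matrix = List[List[int]]


def det(M: Matrix) -> int:
    """Integer determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in M[1:]]
            total += (-1) ** j * a * det(minor)
    return total


def is_totally_unimodular(P: Matrix) -> bool:
    """Check that every square submatrix has determinant 0, +1, or -1."""
    rows, cols = len(P), len(P[0])
    for r in range(1, min(rows, cols) + 1):
        for ri in combinations(range(rows), r):
            for ci in combinations(range(cols), r):
                sub = [[P[i][j] for j in ci] for i in ri]
                if det(sub) not in (0, 1, -1):
                    return False
    return True


def incidence_matrix(num_nodes: int, arcs: List[Tuple[int, int]]) -> Matrix:
    """Node-arc incidence matrix: one column per arc (u, v) with u != v."""
    M = [[0] * len(arcs) for _ in range(num_nodes)]
    for c, (u, v) in enumerate(arcs):
        M[u][c] = 1    # assumed convention: arc leaves u
        M[v][c] = -1   # assumed convention: arc enters v
    return M


M = incidence_matrix(3, [(0, 1), (1, 2), (2, 0), (0, 2)])
M_with_identity = [row + [int(i == j) for j in range(3)] for i, row in enumerate(M)]
print(is_totally_unimodular(M), is_totally_unimodular(M_with_identity))
print(is_totally_unimodular([[1, 1], [-1, 1]]))
```

The last matrix has entries in $\{0, +1, -1\}$ but determinant 2, which shows that the entry condition alone does not imply total unimodularity.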